The Data Grid: Towards an Architecture for the Distributed Management and Analysis of Large Scientific Datasets
Abstract
In an increasing number of scientific disciplines, large data collections are emerging as important community resources. In this paper, we introduce design principles for a data management architecture called the Data Grid. We describe two basic services that we believe are fundamental to the design of a data grid, namely, storage systems and metadata management. Next, we explain how these services can be used to develop higher-level services for replica management and replica selection. We conclude by describing our initial implementation of data grid functionality.
Similar resources
The data grid: Towards an architecture for the distributed management and analysis of large scientific datasets
Ann Chervenak*, Ian Foster†‡, Carl Kesselman*, Charles Salisbury† and Steven Tuecke†. *Information Sciences Institute, University of Southern California, USA; †Mathematics and Computer Science Division, Argonne National Laboratory, USA; ‡Department of Computer Science, The University of ...
E2DR: Energy Efficient Data Replication in Data Grid
Data grids are an important branch of grid computing that provide mechanisms for managing large volumes of distributed data. Energy efficiency has recently emerged as a hot topic in large distributed systems. The development of computing systems has traditionally focused on performance improvements driven by the demands of client applications in scientific and business domai...
Object-Relational Queries into Multidimensional Databases with the Active Data Repository
As computational power and storage capacity increase, processing and analyzing large volumes of multi-dimensional datasets play an increasingly important role in many domains of scientific research. Scientific applications that make use of very large scientific datasets have several important characteristics: datasets consist of complex data and are usually multi-dimensional; applications usually ...
An Efficient Data Replication Strategy in Large-Scale Data Grid Environments Based on Availability and Popularity
Data grid technology, which uses the scale of the Internet to overcome storage limitations for huge amounts of data, has become a hot research topic. Recently, data replication strategies have been widely employed in distributed environments to copy frequently accessed data to suitable sites. The primary purposes are shortening the distance of file transmission and achieving files from ...
Towards an Open Service Architecture for Data Mining on the Grid
Across a wide variety of fields, huge datasets are being collected and accumulated at a dramatic pace. The datasets addressed by individual applications are very often heterogeneous and geographically distributed, and are used for collaboration by communities of users, which are often large and also geographically distributed. There are major challenges involved in the efficient and relia...
Publication date: 1999